Using Shapefiles with rgdal and sf

There are loads of spatial mapping/plotting packages in R. The main two ways to read in spatial data use the rgdal package, and the sf package. Let’s look at how to load/plot line and polygon data.

Let’s load packages first:

Polyline Shapefile Data

For this example, let’s use some Hydroshed Data.

I’ve downloaded rivers for CA and OR and put them here on github. Download the zipped file and unzip it in a data folder. We’re going to use shapefiles for the remainder of this example.

Load shapefiles with rgdal

Let’s load a polyline or line shapefile of rivers of California and Oregon. The result is a SpatialLinesDataFrame in your R environment.

## Source: "/Users/ryanpeek/Documents/github/teaching/mapping-in-R-workshop/data", layer: "rivs_CA_OR_hydroshed"
## Driver: ESRI Shapefile; number of rows: 21976 
## Feature type: wkbLineString with 2 dimensions
## Extent: (-124.5429 32.58542) - (-114.1315 46.26042)
## CRS: +proj=longlat +datum=WGS84 +no_defs  
## LDID: 87 
## Number of fields: 26 
##          name type length typeName
## 1       ARCID    0      9  Integer
## 2    UP_CELLS    2     24     Real
## 3     statefp    4     80   String
## 4     statens    4     80   String
## 5    affgeoid    4     80   String
## 6       geoid    4     80   String
## 7      stusps    4     80   String
## 8        name    4     80   String
## 9        lsad    4     80   String
## 10      aland    2     24     Real
## 11     awater    2     24     Real
## 12 state_name    4     80   String
## 13 state_abbr    4     80   String
## 14 jurisdicti    4     80   String
## 15  statefp.1    4     80   String
## 16  statens.1    4     80   String
## 17 affgeoid.1    4     80   String
## 18    geoid.1    4     80   String
## 19   stusps.1    4     80   String
## 20     name.1    4     80   String
## 21     lsad.1    4     80   String
## 22    aland.1    2     24     Real
## 23   awater.1    2     24     Real
## 24 state_na_1    4     80   String
## 25 state_ab_1    4     80   String
## 26 jurisdic_1    4     80   String
## OGR data source with driver: ESRI Shapefile 
## Source: "/Users/ryanpeek/Documents/github/teaching/mapping-in-R-workshop/data", layer: "rivs_CA_OR_hydroshed"
## with 21976 features
## It has 26 fields
## [1] "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"

Load shapefiles with sf

Here’s how to do the same thing using the sf package. Notice two important differences, the sf package reads things in as a regular dataframe, with the spatial component of the data contained inside a geometry list-column within the dataframe. That means you can operate on this data as you would any data frame. The other main difference, is that reading shape data in is much faster with sf.

## Reading layer `rivs_CA_OR_hydroshed' from data source `/Users/ryanpeek/Documents/github/teaching/mapping-in-R-workshop/data/rivs_CA_OR_hydroshed.shp' using driver `ESRI Shapefile'
## Simple feature collection with 21976 features and 26 fields
## geometry type:  MULTILINESTRING
## dimension:      XY
## bbox:           xmin: -124.5429 ymin: 32.58542 xmax: -114.1315 ymax: 46.26042
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
## Coordinate Reference System:
##   EPSG: 4326 
##   proj4string: "+proj=longlat +datum=WGS84 +no_defs"

Polygon Shapefile Data

No different here, process is the same. But let’s take a look at a package that might be helpful for folks working with state/county boundaries.

Download State/County Data

A nice package for pulling county/state data is the USAboundaries package. Importantly, this package pulls these data in as sf features (dataframes), not as rgdal or SpatialPolygonDataFrames data.

Let’s show this in two steps, the first is how to grab a sf feature for a given state or states.

## Coordinate Reference System:
##   EPSG: 4326 
##   proj4string: "+proj=longlat +datum=WGS84 +no_defs"

That was easy…what about counties? We can use the same type of call, but let’s add some dplyr and purrr functionality here to add the X and Y values for the centroid of each polygon (county) we download. In this case we use map_dbl because it will take a vector or values (the geometry col here), map a function over each row in that vector, and return a vector of values (the centroid points).


Advanced Mapping

Okay, let’s add in the county/state data, and figure out a few tricks to making our map a bit cleaner.

Cropping & Buffering with sf

Great thing is that sf has some really nice tools for this, just as any GIS program would. Here let’s use st_intersection to crop our river layer to only rivers in the counties of interest.

## although coordinates are longitude/latitude, st_intersection assumes that they are planar
## Warning: attribute variables are assumed to be spatially constant
## throughout all geometries

## Warning in st_buffer.sfc(st_geometry(x), dist, nQuadSegs, endCapStyle =
## endCapStyle, : st_buffer does not correctly buffer longitude/latitude data
## dist is assumed to be in decimal degrees (arc_degrees).

Great, let’s try our plot again.

Ok! Not bad. What can we improve?


Using ggplot2 and sf

We can use ggplot2 to plot our sf data! Many more options using this platform for sf. Mapping with ggplot2 brings some extra things we can fiddle with. Since these data are all data.frames (sf features), we can use the geom_sf function.

One important note…there’s a common error that you may get semi-regularly, which will look something like this:

Error in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y,  : 
  polygon edge not found

Don’t worry about this, it has to do with the grid package…it’s annoying but all you need to do is replot your map/figure (sometimes several times) and eventually the plot will work/show up.

Great, this is a nice example…but what if we want to crop to the area of interest like we did previously with the st_bbox? Well, we can add a call to the coord_sf() to limit the range we’re interested in. Let’s also add an “annotation” to our map to delineate the CA state line.

Interactive maps

We can add our sf data directly into a leaflet map using the mapview package.

Put it All Together

  • Can you crop to a single county and plot the rivers and county?
  • How might you make an inset in your map? (hint…see here)
  • What about buffering outside of the selected counties by 30 km?
  • Can you add some points? Try adding a point at 39.4 N and 121.0 W.